Learning Bilingual Phrase Representations with Recurrent Neural Networks
Authors
Abstract
We introduce a novel method for bilingual phrase representation with Recurrent Neural Networks (RNNs), which transforms a sequence of word feature vectors into a fixed-length phrase vector across two languages. Our method measures the difference between the vectors of source- and target-side phrases, and can be used to predict the semantic equivalence of the source and target word sequences in the phrasal translation units used in phrase-based statistical machine translation. Our experiments show that the proposed method is effective in a bilingual phrasal semantic equivalence determination task and in a machine translation task.
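As a rough illustration of the idea in the abstract, the sketch below folds a sequence of word vectors into a fixed-length phrase vector with a plain recurrent update, once per language, and then compares the two phrase vectors. Everything concrete here is an assumption for illustration, not the paper's actual model: the Elman-style update h_t = tanh(W x_t + U h_{t-1} + b), the untrained random weights, the Euclidean distance, and the names SimpleRNNEncoder and phrase_distance are all hypothetical.

```python
import math
import random


def make_matrix(rows, cols, rng, scale=0.1):
    """Return a rows x cols matrix of small random weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


class SimpleRNNEncoder:
    """Hypothetical Elman-style encoder: h_t = tanh(W x_t + U h_{t-1} + b).

    The weights are random and untrained; the abstract does not specify
    the recurrent cell, so this choice is an assumption.
    """

    def __init__(self, input_dim, hidden_dim, seed):
        rng = random.Random(seed)
        self.hidden_dim = hidden_dim
        self.W = make_matrix(hidden_dim, input_dim, rng)   # input-to-hidden
        self.U = make_matrix(hidden_dim, hidden_dim, rng)  # hidden-to-hidden
        self.b = [0.0] * hidden_dim

    def encode(self, word_vectors):
        """Fold a variable-length sequence into one fixed-length phrase vector."""
        h = [0.0] * self.hidden_dim
        for x in word_vectors:
            wx = matvec(self.W, x)
            uh = matvec(self.U, h)
            h = [math.tanh(a + c + bias) for a, c, bias in zip(wx, uh, self.b)]
        return h  # the last hidden state serves as the phrase vector


def phrase_distance(source_vec, target_vec):
    """Euclidean distance between two phrase vectors (an assumed measure)."""
    return math.sqrt(sum((s - t) ** 2 for s, t in zip(source_vec, target_vec)))


# Toy usage with made-up 4-dimensional word vectors.
source_encoder = SimpleRNNEncoder(input_dim=4, hidden_dim=3, seed=0)
target_encoder = SimpleRNNEncoder(input_dim=4, hidden_dim=3, seed=1)

source_phrase = [[0.1, 0.2, 0.0, 0.5], [0.3, 0.1, 0.4, 0.0]]  # two words
target_phrase = [[0.2, 0.0, 0.1, 0.3], [0.0, 0.5, 0.2, 0.1],
                 [0.4, 0.1, 0.0, 0.2]]                        # three words

src_vec = source_encoder.encode(source_phrase)
tgt_vec = target_encoder.encode(target_phrase)
distance = phrase_distance(src_vec, tgt_vec)
print(len(src_vec), len(tgt_vec), distance)
```

Both phrases end up as vectors of the same length even though they contain different numbers of words, which is what makes a direct comparison possible. In a trained system the encoders would presumably be optimized so that translation-equivalent phrase pairs receive a small distance; the training objective is not given in the abstract and is not shown here.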
Similar resources
Learning Bilingual Distributed Phrase Representations for Statistical Machine Translation
Following the idea of using distributed semantic representations to facilitate the computation of semantic similarity between translation equivalents, we propose a novel framework to learn bilingual distributed phrase representations for machine translation. We first induce vector representations for words in the source and target language respectively, in their own semantic space. These word v...
Bilingual Lexicon Induction by Learning to Combine Word-Level and Character-Level Representations
We study the problem of bilingual lexicon induction (BLI) in a setting where some translation resources are available, but unknown translations are sought for certain, possibly domain-specific terminology. We frame BLI as a classification problem for which we design a neural network based classification architecture composed of recurrent long short-term memory and deep feed forward networks. Th...
Transduction Recursive Auto-Associative Memory: Learning Bilingual Compositional Distributed Vector Representations of Inversion Transduction Grammars
We introduce TRAAM, or Transduction RAAM, a fully bilingual generalization of Pollack’s (1990) monolingual Recursive Auto-Associative Memory neural network model, in which each distributed vector represents a bilingual constituent—i.e., an instance of a transduction rule, which specifies a relation between two monolingual constituents and how their subconstituents should be permuted. Bilingual ...
Convolution-Enhanced Bilingual Recursive Neural Network for Bilingual Semantic Modeling
Estimating similarities at different levels of linguistic units, such as words, sub-phrases and phrases, is helpful for measuring semantic similarity of an entire bilingual phrase. In this paper, we propose a convolution-enhanced bilingual recursive neural network (ConvBRNN), which not only exploits word alignments to guide the generation of phrase structures but also integrates multiple-level ...
Bilingual Distributed Phrase Representations for Statistical Machine Translation
Phrase-based machine translation (PBMT) relies upon the phrase table as the main resource for bilingual knowledge at decoding time. A phrase table in its basic form includes aligned phrases along with four probabilities indicating aspects of the co-occurrence statistics for each phrase pair. In this paper we add a new semantic similarity score as a statistical feature to enrich the phrase table...